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Inferring the internal interaction patterns of a complex dynamical system is a challenging problem. 
Traditional methods often rely on examining the correlations among the dynamical units. However, in 
systems such as transcription networks, one unit's variable is also correlated with the rate of change of 
another unit's variable. Inspired by this, we introduce the concept of derivative- variable correlation, and use 
it to design a new method of reconstructing complex systems (networks) from dynamical time series. Using 
a tunable observable as a parameter, the reconstruction of any system with known interaction functions is 
formulated via a simple matrix equation. We suggest a procedure aimed at optimizing the reconstruction 
from the time series of length comparable to the characteristic dynamical time scale. Our method also 
provides a reliable precision estimate. We illustrate the method's implementation via elementary dynamical 
models, and demonstrate its robustness to both model error and observation error. 
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Modern world relies heavily on complex interconnected systems, such as Internet and social media or 
transport and communication networks. In addition to technological utilities, complex systems are 
found on various scales in nature and society: gene regulation, protein and metabolic networks, food 
webs, economic and social networks, climate. Among the foremost problems here is the development of methods 
for reconstructing the structure (topology) of real networks from the observable data. Topology, in combination 
with the inter-unit interactions, determine the function of complex dynamical systems (networks) 1 . 
Reconstruction methods are often developed within the contexts of particular fields, relying on domain -specific 
approaches. These include gene regulations 2 5 , metabolic systems 6 , neuroscience 7 , or social networks 8 . On the 
other hand, theoretical reconstruction concepts are based on paradigmatic dynamical models such as phase 
oscillators 912 , some of which have been experimentally tested 13,14 . In a similar context, techniques for detecting 
hidden nodes in networks are being investigated 15 . A class of general reconstruction methods exploit the time 
series obtained by quantifying the system behaviour. Some of them assume the knowledge of the internal 
interaction functions 1617 , while others do not 18 . Network couplings can be examined via an information-theoretic 
approach 19 . The advantage of these methods is that they are non-invasive, i.e. require no interfering with the on- 
going dynamics. 

Reconstruction methods are often based on examining the dynamical correlations among the system units 
(network nodes) 12 . On the other hand, models of complex dynamical systems such as Eq.l, are usually based on 
expressing the time derivative of a node as a combination of a local and a coupling term. Inspired by this, we 
propose a non-invasive reconstruction method, relying on the concept of derivative-variable correlation. Our 
method assumes the dynamical time series to be available as measured observables, and the interaction functions 
to be known. We present our theory in a general form, extending our initial results 20 . As we show, our approach 
allows for the reconstruction precision to be estimated, indicating the level of noise in the data, or possible 
mismatches in the knowledge of the interaction functions. 

Results 

The reconstruction method. We consider a general complex network consisting of N nodes, described by their 
dynamical states x ; (r). Its time evolution is governed by: 



Xi=/;(xO+X!Vv(*/)> 



(1) 



where the function^ represents the local dynamics for each node, and hj models the action of the node j on other 
nodes. The network topology is encoded in the adjacency matrix A«, specifying the strength with which the node; 
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Figure 1 | Adjacency matrices of two examined dynamical networks. Adjacency matrix A for the first example (a), and the second example (b). 
Colorbars (shades) indicate the interaction strength. Two different colorbars in (b) stand for two different interaction types (see text). 



acts on the node i. We assume that: (i) the complex system evolves 
according to Eq.l, (h) the interaction functions/] and hj are precisely 
known, and (Hi) a discrete trajectory consisting of L values x,(f i), . . . , 
Xj(t L ) is known for each node. The measurements of x, are separated 
by the uniform observation interval 5 t defining the time series 
resolution. We seek to reconstruct the unknown adjacency matrix 
Ay = A under these assumptions. 

The starting point is to define the following correlation matrices, 
using the observable g(x) whose role will be explained later: 



B 

C 
E 



(g{ x i)Xj), 



(2) 



where (■) denotes time-averaging ( r ) ~ j^2-l _ l r {t m ). Inserting 
into the Eq.l, we obtain the following linear relation between the 
correlation matrices: 



A=E _1 -(B-C), 



(3) 



which is our main reconstruction equation, applicable to any system 
with dynamics given by Eq.l. Time series are to be understood as the 
available observables, allowing for matrices in Eq.2 to be computed 
for any g. For the infinitely long dynamical data, reconstruction is 
always correct for any non-trivial choice of g. For short time series, 
representing experimentally realistic scenarios, the reconstruction is 
always approximate, and its precision crucially depends on the 
choice oig (usually, correlations are defined as central moments with 
averages subtracted - instead, we are here not interested in correla- 
tions per se, but in the reconstruction according to Eq.3, for which 
the subtraction of averages is not needed). To be able to quantify the 
reconstruction precision, we need to equip ourselves with the 
adequate measures. To differentiate from the original adjacency 
matrix A, we call the reconstructed matrix Ry = R, and express the 
matrix error as: 



{R.j-A„) 



\ E„A| 



(4) 



Of course, each R is computed according to Eq.3 in correspondence 
with the chosen g. However, since the matrix A is unknown, we have 
to introduce another precision measure, based only on the available 
data (time series and interaction functions). A natural test for each R 
is to quantify how well does it reproduce the original data Xj(t m ). We 
apply the following procedure: start the dynamics from x,(fi) and run 
it using R until t = f 2 ; denote thus obtained values j,(f 2 ); re-start the 
run from Xj(t 2 ) and run until t = t 3 , accordingly obtaining y,(t 3 ), and 
so on. The discrepancy between the reconstructed time series yj(t m ) 



and the original Xj(t m ) is an explicit measure of the reconstruction 
precision, based solely on the available data. We name it trajectory 
error At, and define it as follows: 



A r = -V 



((xi-yif) 



tffrV <(*-<*» > 



(5) 



Different choices of the observable g lead to different R, with different 
precisions expressed through errors A r and A^. As we show below, 
these two error measures are related, meaning that small A r suggests 
small A A . The function g hence plays the role of a tunable parameter, 
which can be used to optimize the reconstruction. By considering 
many R-s obtained through varying g, we can single out R-s with the 
minimal A r to obtain the best reconstruction. 

Implementation of the method. To illustrate the implementation of 
our method, we begin by formulating a complex dynamical system as 
a network with N = 6 nodes. 17 directed links, modelling inter-node 
interactions, are put between randomly chosen pairs of nodes. We 
examine two different examples of interactions. As our first example, 
we consider the Hansel- Sompolinsky model, describing the 
dynamics of firing in neural populations 21 . It is defined by the 
interaction functions^ = — x and hi = tanh x which are fixed for 
all nodes. The adjacency matrix is specified by assigning positive and 
negative weights to the links, randomly chosen from [ — 10, 10], as 
shown in Fig. 1 a. Weights model the varying strengths of interaction. 
Starting from random initial conditions, the resulting system is 
integrated from t = 0 to t = 4. During the run, 20 values of x, are 
stored for each node, equally spaced with 8 t = 0.2. The obtained time 
series, shown in Fig. 2, are rather short compared to the characteristic 
time scale and the system size. 

We now use these data to reconstruct the original adjacency 
matrix by employing the procedure described above. We consider 
a set of 10 4 test-functions g, each composed of first 10 Fourier har- 
monics 



g(x) — ^2 l a k sin(fcx) + bk cos(fcx) 



(6) 



The coefficients and are randomly selected from [0, 100] with 
the log-uniform probability (to emphasize smaller coefficient 
values). This is implemented by selecting each Fourier coefficient 
via io 20043213 ' < ™" rf — 1.0, where rand is a random number between 
0 and 1. A typical function thus constructed for each choice of and 
bk will have all 10 Fourier components, but one (or at most few) will 
be well pronounced. Functions are then normalized to the range of 
time series values. Given relatively smooth time series, lower har- 
monics are expected to generally extract more features from data, 
which is why we limit ourselves to the first 10 harmonics. To improve 
the stability of the derivative estimates, we base our calculations on 
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Figure 2 | An example of timeseries. Time series for all 6 nodes produced 
strength jy = 0.4 (see text, cf. Fig. 5a). 

the set of time points z m = (t m+1 — t m )/2. For eachg, the matrix R is 
obtained via Eq.2 and Eq.3, with the invertibility of each E checked by 
virtue of the singular value decomposition. The errors A T and are 
then calculated for each R, and reported as a scatter plot in Fig. 3a. 

The main result of this analysis is a clear correlation between A r 
and A A , particularly pronounced for smaller values of errors. This 
confirms that the best R are among those that display minimal A T . In 
order to identify the best reconstruction and estimate its precision, 
we focus on the 1% of matrices R with the minimal A T , as illustrated 
in Fig. 3a by the dashed vertical line. The variability of R within this 
group can be viewed as the reconstruction precision. Small variability 
indicates the invariance of R to the choice of g, which suggests a good 
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the network Fig. la (black dots). Bars denote the added white noise of 



reconstruction. Large variability of R implies its drastic dependence 
on g, indicating a bad precision. We quantify this by computing the 
mean and the standard deviation for each matrix element of R within 
this group, and identify them, respectively, with the best reconstruc- 
tion value and its precision. In Fig. 4a we report the original A and the 
best reconstruction, along with the respective errorbar for each 
matrix element, describing the reconstruction precision. The recon- 
struction is indeed very good for both zero and non-zero weights (i.e. 
for non-linked and linked node pairs). 

Our second example of node dynamics is a model describing gene 
interactions, with the coupling functions of two types: activation 
ft+ =x 5 /(l +x 5 ) and repression hj~ = l/(l + x 5 ) 22 . Local inter- 
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Figure 3 | Scatter plots of errors A T vs. \ A . Scatter plots of errors A r and A A , in relation with the first and second example, in (a) and (b), respectively. Best 
1% of R-s with the minimal A r , are represented by the dots left of the vertical dashed line. 
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Figure 4 | Network reconstruction with errorbars. Elements of the 
original A (circles), and the best reconstruction (crosses), with the 
corresponding errorbars. First and second example in (a) and (b), 
respectively. 

action are again modeled via/ = —x. The adjacency matrix is based 
on the same network, and defined by assigning a random weight 
from [0, 1] for each link, as shown in Fig. lb. The nodes 1-3 (respect- 
ively, 4-6) act activatorily (repressively) on all nodes that they act 
upon. Again, we run the dynamics from t = 0 to t = 4, obtaining 
another set of time series with 20 points (not shown). The same 
reconstruction procedure is applied, yielding the A r vs. A A scatter 
plot shown in Fig. 3b. Using the same procedure, we obtain the best 
reconstruction and show it in Fig. 4b. The precision is again very 
good, although the relationship between A T and A A is now different, 
since the two examined systems display different degrees of non- 
linearity. This shows that our method applies even in cases of 
strongly non-linear interaction functions, which capture most real 
dynamical scenarios. 



Testing the method's robustness. In order to model the real applica- 
bility of our method, we test its robustness to possible violations of 
the initial assumptions, focusing on the first dynamical example 
(Fig. la). We start with the scenario when the interaction functions 
are not precisely known - we assume a small mismatch in their 
mathematical form (model error). Instead of the original/ = — x 
and hj = tanh x, we take/ = — l.lx and hi = tanh(l.lx) + O.lx. 
The measurements of A r now cannot be expected to converge to 
zero. Nevertheless, we apply the same procedure, and find (a 
weaker) correlation between A T and A A , as shown (by black dots) 
in Fig. 5a. To see the worsening of the precision clearly, grey dots 
show the original non-perturbed scatter plot from Fig. 3a. Dashed 
vertical line shows the part of the error A r which is unavoidable due 
to the presence of the perturbation. We compute it by averaging the 
difference between the original and the perturbed interaction 
functions over the y-range of the time series (for the function h we 

tanh(x)— tanh(l.bc)— O.lx , . , 

average over the y-range of the time 

series, and equivalently for the function /; since the two 
contributions can not be simply added, we consider only the larger 
one, in this case h, which gives the lower bound on such error). Such 
error is not related to the performance of our method, so we show it 
in the figure in order to isolate more clearly the part of the error that 
in fact arises from our method. The remaining A T is similar to the A T 
occurring in the non-perturbed case. This demonstrates that our 
method works even under perturbed conditions. The worsening of 
the reconstruction precision is what expected from the nature of the 
perturbation, meaning that our method makes no additional 
"unexpected" errors in the perturbed conditions. The best 
reconstruction and the corresponding errorbars are computed as 
before and shown in Fig. 6a. The errorbars are larger and the 
reconstruction precision worsens. Still, the essential fraction of 
elements of A are within the respective errorbars. The decline of 
precision is controllable, since it is clearly signalized by the size 
of the errorbars. This could be used to generalize the method in 
the direction of detecting the interaction functions as well. Each 
best R would be accompanied by the best guesses for /■ and hp 
meaning that different network topologies, reproducing the data 
equally well, would come in combination with different / and hp 

To test the third assumption of our theory, we take the time series 
to be not precisely known due to observation errors. Uncorrelated 
white noise of intensity n = 0.4 is added, perturbing each value of the 
time series. Instead of the original data, we now consider one real- 
ization of the noisy data, as illustrated in Fig. 2 (interaction functions 
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Figure 5 | Scatter plots of A r vs. \ A for model and observation error scenarios. Scatter plots of errors A x and A A (black dots), for the model error 
scenario in (a) and the observation error scenario in (b). Original non-perturbed scatter plotfromFig. 3a is shown in gray for comparison. Vertical dashed 
lines depicts the part of the A r error which is unavoidable in the presence of the perturbation (see text). 
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Figure 6 | Network reconstruction with errorbars for the model and 
observation error scenarios. Elements of the original A (circles), and the 
best reconstruction (crosses), with the corresponding errorbars (first 
example). Model error and observation error scenarios in (a) and (b), 
respectively. 

are the original ones). The central problem now is the computation of 
the derivatives, which are extremely sensitive to the noise. We 
employ the Savitzky-Golay smoothing filter 23 as a standard tech- 
nique of data de-noising, which allows for a good derivative estima- 
tion. Since the time series are short, we apply the smallest smoothing 
parameters (polynomial degree 2, window size 2). The reconstruc- 
tion procedure is applied as before, using smoothed derivatives to 
compute matrix B in Eq.2. The scatter plot of A T vs A A is shown in 
Fig. 5b, again compared with the original plot, and with the perturba- 
tion-induced unavoidable error indicated by the vertical line (com- 
puted as before, this time averaging the perturbation value over the 
y-range of the time series). The worsening of the precision is of a 
similar magnitude as in the model error scenario. The best recon- 
struction and the corresponding errorbars are reported in Fig. 6b. 
Note that the precision is again correctly captured by the size of the 
errorbars. In two cases from Fig. 6, the precision does not decline 
uniformly for all links. The analysis above shows that our reconstruc- 
tion method is reasonably robust to both model and observation 
error. We found this robustness to be qualitatively independent of 
the realization of both these errors. 

Discussion and Conclusions 

We presented a method of reconstructing the topology of a general 
complex dynamical system from the time series with known inter- 
action functions. Through conceptually novel approach, our method 
is formulated as an inverse problem using linear systems formalism 24 . 
Rather than relying on the correlations between the observed variables, 
it is based on the correlations between the variables and their time 
derivatives. Our method involves two important factors: it applies to 
the data that is relatively short, i.e. of the length comparable to the 
system size and to the characteristic time scale; and, it yields the error- 
bars as a by-product, correctly reflecting the reconstruction precision. 

On the other hand, our theory relies on knowing (at least approxi- 
mately) both the dynamical model Eq.l and the interaction func- 
tions. While these assumptions might limit the immediate 



applicability of our method, our idea presents a conceptual novelty, 
potentially leading towards a more general and applicable recon- 
struction method. For example, we expect applicability in studies 
of interacting neurons in slices or cultures, where the properties of 
the individual neurons (i.e. functions /and h) can be relatively well 
established, while the adjacency matrix is unknown. In contrast, the 
application to problems such as brain fMRI activity patterns, where 
even the existence of a dynamical model like Eq.l is questionable, 
appears at present not possible. 

Our theory includes choosing the tunable observables g, which 
allow for the reconstruction to be optimized within the constrains 
of any given data. The question of constructing the optimal g which 
extracts the maximal extractable information, remains open. Our 
algorithm can be reiterated: once the 1% of the best R-s are found, 
one can examine the functions g leading to those 1%, and repeat the 
procedure, sampling only the neighboring portion of the functional 
space. Alternatively, various evolutionary optimization algorithms 
could be used 25 . An important factor for the method's applicability 
is the dynamical regime behind the time series, which could be regu- 
lar and stable (for example periodic) or chaotic and unstable (starting 
from general initial conditions, the transients are typically irregular). 
The former case is less reconstructible, because of a poor coverage of 
the phase space. In particular, synchronized dynamics, being essen- 
tially non-sensitive to the variations of the coupling coefficients, 
offers very little insight into the structure of the underlying system. 
Increasing the time series length is obviously of no help 20 . In contrast, 
the latter case contains more extractable system information, and is 
potentially more reconstructible. Chaotic dynamics provides a better 
coverage of the phase space, adding new information with increasing 
length of the time series. Another issue is the applicability to large 
networks N<C1, and in particular, the dependence of precision on 
relationship between N and L. This relates to the possibility of quan- 
tifying the network information content of the available data. 

Another limitation of our theory comes from the form of Eq.l. A 
similar theory could be developed for alternative scenarios, such as h 
specified by both source and target nodes. The real challenge here are 
the networks with non-additive inter-node coupling (i.e., the dynam- 
ical contribution to the node i is not a mere sum of neighbours' 
inputs). The key practical problem is that the mathematical forms 
of/ and h are not (precisely) known for many real networks, although 
for certain systems they can be inferred with a reasonable confid- 
ence 4,5 . Noise always hinders the reconstruction, specially via deriv- 
ative estimates. However, longer time series not only bring more 
information, but also allow for a better usage of smoothing. 
Finally, we note that the network reconstruction problem is "oppos- 
ite" of the network design problem. Our method could be employed 
to design a system that displays given dynamics. However, while any 
system with Aj — 0 solves the design problem, in the reconstruction 
theory this creates the permanent issue of isolating the true system 
among those that exhibit A 7- ~ 0. 

Finally, we note that the comparison among the methods aimed at 
inferring the internal structure of dynamical systems is of great 
interest. Besides providing a consistent evaluation of the perform- 
ance of various methods, such comparison could help one choose the 
most suitable method for a given problem. However, the initial hypo- 
theses for various methods are very diverse, which makes the object- 
ive comparison among them very hard, at least at the moment. 
Directly applicable methods 318 rely on experimentally realistic 
assumptions, but often perform poorly. Theoretical reconstruction 
ideas (such as the present or 9-12 ) are based on stronger assumptions 
and show better performance, although their concrete applicability is 
for now limited. Nevertheless, this remains an important and inter- 
esting avenue for future work. 
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